Beyond Supervised Learning
Abstract
Supervised learning is a powerful technique for building models that associate data with given targets. Today it is a successful method that is widely adopted in industry. However, supervised learning comes with limitations: it relies on costly, time-consuming and error-prone manual labeling of a large set of examples. Moreover, animals do not seem to learn about objects through a teacher during their lifetime; it seems possible that much of their learning occurs in an unsupervised manner. I will illustrate two general ideas that show a path towards learning without annotation: self-supervised learning and unsupervised disentangling of factors of variation. Self-supervised learning has emerged as a successful method for learning useful features without manual labeling. It exploits the inherent structure of the input data through so-called pretext tasks. I will give an overview of methods in the literature and introduce some state-of-the-art methods developed in our group. In this context I will also present a method that makes it possible to distill learned pretext tasks into a dataset of samples and pseudo-labels. For the first time, it is then possible to compare handcrafted features, such as HOG, with learned features in a common framework. I will also discuss a second approach to unsupervised learning, which aims at disentangling the main factors of variation of the data. The idea behind this approach is to automatically cluster variability in the data that can be represented by the same attribute. For example, if the data consists of faces, possible factors of variation are gender, hair style, the presence of glasses, beards, hats and scarves, pose, expression, skin color, and so on. I will present several techniques from the literature and from our group that allow such factors to be identified in an unsupervised manner or with weak labels.
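To make the notion of a pretext task concrete, here is a minimal sketch that assumes rotation prediction as the pretext task. That task is one well-known example from the literature, not necessarily a method covered in the talk. The function name `make_rotation_pretext_batch` and the random 8x8 arrays standing in for unlabeled images are hypothetical and used for illustration only.

```python
import numpy as np


def make_rotation_pretext_batch(images):
    """Build (rotated_image, pseudo_label) pairs from unlabeled square images.

    The pseudo-label k in {0, 1, 2, 3} is the number of 90-degree turns
    applied, so the supervision signal comes from the data itself rather
    than from a human annotator. Rotation prediction is an assumed example
    of a pretext task, chosen only for illustration.
    """
    rotated, labels = [], []
    for img in images:
        for k in range(4):
            rotated.append(np.rot90(img, k=k, axes=(0, 1)))
            labels.append(k)
    return np.stack(rotated), np.array(labels)


# Hypothetical stand-in for an unlabeled dataset: 5 grayscale 8x8 images.
rng = np.random.default_rng(0)
unlabeled = rng.random((5, 8, 8))

x, y = make_rotation_pretext_batch(unlabeled)
print(x.shape, y[:4])
```

A classifier trained to predict `y` from `x` would have to pick up on object orientation cues, and its intermediate features could then be reused for other tasks. No manual labels are involved at any point.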
Similar resources
Distributional semantics beyond words: Supervised learning of analogy and paraphrase
There have been several efforts to extend distributional semantics beyond individual words, to measure the similarity of word pairs, phrases, and sentences (briefly, tuples; ordered sets of words, contiguous or noncontiguous). One way to extend beyond words is to compare two tuples using a function that combines pairwise similarities between the component words in the tuples. A strength of this...
Semi-Supervised Learning Based Prediction of Musculoskeletal Disorder Risk
This study explores a semi-supervised classification approach using random forest as a base classifier to classify the low-back disorder (LBD) risk associated with industrial jobs. The semi-supervised classification approach uses unlabeled data together with a small amount of labelled data to create a better classifier. The results obtained by the proposed approach are compared with those o...
Estimate Unlabeled-Data-Distribution for Semi-supervised PU Learning
Traditional supervised classifiers use only labeled data (features/label pairs) as the training set, while the unlabeled data is used as the testing set. In practice, it is often the case that the labeled data is hard to obtain and the unlabeled data contains the instances that belong to the predefined class beyond the labeled data categories. This problem has been widely studied in recent year...
Learnability and Stability in the General Learning Setting
We establish that stability is necessary and sufficient for learning, even in the General Learning Setting where uniform convergence conditions are not necessary for learning, and where learning might only be possible with a non-ERM learning rule. This goes beyond previous work on the relationship between stability and learnability, which focused on supervised classification and regression, whe...
An Improved Supervised Learning Algorithm Using Triplet-Based Spike-Timing-Dependent Plasticity
The purpose of supervised learning with temporal encoding for spiking neurons is to make the neurons emit arbitrary spike trains in response to given synaptic inputs. In recent years, supervised learning algorithms based on synaptic plasticity have developed rapidly. As one of the most efficient supervised learning algorithms, the remote supervised method (ReSuMe) uses the conventional pair-ba...
Beyond LDA: Exploring Supervised Topic Modeling for Depression-Related Language in Twitter
Topic models can yield insight into how depressed and non-depressed individuals use language differently. In this paper, we explore the use of supervised topic models in the analysis of linguistic signal for detecting depression, providing promising results using several models.
Publication date: 2018